
[quantization] Introduce Qwen3VLVisionMLP wrapper#485

Merged
mhs4670go merged 3 commits into Samsung:main from stamalakhov:vision_mlp_pr
Feb 13, 2026
Conversation

@stamalakhov
Contributor

This commit adds Qwen3VLVisionMLP wrapper and tests for it.

┌───────────── Quantization Error Summary ─────────────
│ Mean |diff|: 0.016181
│ PEIR       : 1.979650 %
└──────────────────────────────────────────────────────
    ┌────────────────────────────────────────────┐
 7.6┤                                            │
    │                                        ••  │
    │                                      ••    │
 5.1┤                                    •••     │
    │                                 ••••       │
    │                               ••••         │
    │                             ••••           │
 2.7┤                           ••••             │
    │                         ••••               │
    │                       ••••                 │
 0.3┤                     ••••                   │
    │                   ••••                     │
    │                 ••••                       │
    │               ••••                         │
-2.1┤             •••                            │
    │           ••••                             │
    │         •••                                │
-4.6┤       •••                                  │
    │     •••                                    │
    │   •••                                      │
    │  ••                                        │
-7.0┤                                            │
    └┬──────────┬──────────┬─────────┬──────────┬┘
   -7.0       -3.3        0.3       3.9       7.6 

Quantized Circle model saved to /mnt/storage/slow_repos/VLM_TICO/TICO/qwen3vl_vision_mlp.q.circle
./ccex test -k quantization.wrapq.wrappers.qwen_vl.test_quant_vision_mlp

RUN unit tests with -k quantization.wrapq.wrappers.qwen_vl.test_quant_vision_mlp ...
test_mode_and_forward (quantization.wrapq.wrappers.qwen_vl.test_quant_vision_mlp.TestQuantQwenVisionMLP.test_mode_and_forward) ... ok
test_calib_quant_export (quantization.wrapq.wrappers.qwen_vl.test_quant_vision_mlp.TestSubgraphExport.test_calib_quant_export) ... ok

----------------------------------------------------------------------
Ran 2 tests in 0.893s

OK
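For orientation, here is a minimal sketch of the shape of such a wrapper (hypothetical and heavily simplified; the actual implementation lives in tico/quantization/wrapq/wrappers/qwen_vl/quant_vision_mlp.py and uses calibrated observers rather than the ad-hoc per-tensor scale shown here):

```python
import torch
import torch.nn as nn

# Hypothetical sketch of a fake-quant wrapper around a two-layer vision MLP
# (linear_fc1 -> GELU -> linear_fc2, the structure visible in the PR's
# traceback). Not the actual TICO code.
class FakeQuantVisionMLP(nn.Module):
    def __init__(self, in_features: int, hidden_features: int):
        super().__init__()
        self.linear_fc1 = nn.Linear(in_features, hidden_features)
        self.act = nn.GELU()
        self.linear_fc2 = nn.Linear(hidden_features, in_features)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Fake-quantize the input activation to an int8 range before fc1.
        scale = x.abs().max().clamp(min=1e-8) / 127.0
        x_q = torch.fake_quantize_per_tensor_affine(x, float(scale), 0, -128, 127)
        return self.linear_fc2(self.act(self.linear_fc1(x_q)))

mlp_q = FakeQuantVisionMLP(16, 32)
out = mlp_q(torch.randn(4, 16))
```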

Draft: #484

@stamalakhov stamalakhov requested review from a team and mhs4670go February 12, 2026 05:10
@stamalakhov stamalakhov self-assigned this Feb 12, 2026
@stamalakhov stamalakhov force-pushed the vision_mlp_pr branch 2 times, most recently from bbd856b to e280ab2 Compare February 12, 2026 05:37
Contributor

I believe AutoModelForVision2Seq is included in recent transformers versions.
Could you note your torch and transformers versions in a document somewhere?

We need to convert the examples into CI someday, so it would be helpful to write down the versions.

Contributor Author

@dayo09
Ah, it turns out that AutoModelForVision2Seq was removed in v5.0. I believe I should change it to AutoModelForImageTextToText, which has been available in transformers since version 4.5.
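As an aside, one way to stay compatible across transformers releases is a small resolution shim; this is purely illustrative (the helper name and approach are assumptions, not part of this PR):

```python
# Hypothetical compatibility shim: AutoModelForVision2Seq was removed in
# transformers v5.0, so resolve whichever vision-language AutoModel class
# the installed release actually provides.
def resolve_auto_vlm_class(transformers_module):
    for name in ("AutoModelForImageTextToText", "AutoModelForVision2Seq"):
        cls = getattr(transformers_module, name, None)
        if cls is not None:
            return cls
    raise ImportError("no vision-language AutoModel class found")
```

Usage would be `resolve_auto_vlm_class(transformers).from_pretrained(...)`, preferring the newer class name when both exist.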

Contributor Author

@dayo09
Fixed. Am I supposed to support AutoModelForVision2Seq for versions below 4.5?

Contributor

@stamalakhov Thanks so much!

Fixed. Am I supposed to support AutoModelForVision2Seq for versions below 4.5?

No, I don't think so. I believe just mentioning the version is enough for now; it's just for later testing. We are preparing to evaluate the feasibility of quantization and frontend compilation for the Qwen3-VL model structure. It's not about deployment level yet. :-D

Contributor

BTW, could you share your torch and transformers versions? I encounter this error by running the example. 😓

Loading weights: 100%|████████████████████████████████████████████████████████████████████████████| 625/625 [00:00<00:00, 1913.24it/s, Materializing param=model.visual.pos_embed.weight]
Traceback (most recent call last):
  File "/home/dayo/git/TICO/tico/quantization/wrapq/examples/quantize_qwen_vision_mlp.py", line 74, in <module>
    int8_out = mlp_q(hidden)
  File "/home/dayo/miniconda3/envs/py310-tvm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1776, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/dayo/miniconda3/envs/py310-tvm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1787, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/dayo/git/TICO/tico/quantization/wrapq/wrappers/ptq_wrapper.py", line 46, in forward
    return self.wrapped(*args, **kwargs)
  File "/home/dayo/miniconda3/envs/py310-tvm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1776, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/dayo/miniconda3/envs/py310-tvm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1787, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/dayo/git/TICO/tico/quantization/wrapq/wrappers/qwen_vl/quant_vision_mlp.py", line 76, in forward
    fc1 = self.linear_fc1(x_q)
  File "/home/dayo/miniconda3/envs/py310-tvm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1776, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/dayo/miniconda3/envs/py310-tvm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1787, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/dayo/git/TICO/tico/quantization/wrapq/wrappers/ptq_wrapper.py", line 46, in forward
    return self.wrapped(*args, **kwargs)
  File "/home/dayo/miniconda3/envs/py310-tvm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1776, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/dayo/miniconda3/envs/py310-tvm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1787, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/dayo/git/TICO/tico/quantization/wrapq/wrappers/nn/quant_linear.py", line 57, in forward
    w = self.obs_weight.fake_quant(w)
  File "/home/dayo/git/TICO/tico/quantization/wrapq/observers/affine_base.py", line 152, in fake_quant
    return torch.fake_quantize_per_channel_affine(
RuntimeError: !needs_dynamic_casting<func_t>::check(iter) INTERNAL ASSERT FAILED at "/pytorch/aten/src/ATen/native/cpu/Loops.h":311, please report a bug to PyTorch. 

Contributor Author

No, I don't think so. I believe just mentioning the version is enough for now; it's just for later testing. We are preparing to evaluate the feasibility of quantization and frontend compilation for the Qwen3-VL model structure. It's not about deployment level yet. :-D

@dayo09
Got it. Thank you.

Contributor

@stamalakhov Ah, it was about the module's dtype==torch.bfloat16
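For reference, a minimal workaround sketch for that failure (an assumption about the root cause, not the actual TICO fix): the per-channel fake-quant kernel chokes on bfloat16 input, so the weight can be cast to float32 around the call and cast back afterwards:

```python
import torch

# Hypothetical dtype-safe per-channel fake-quant helper (not the TICO
# observer code): cast bfloat16 weights to float32 for the fake-quant
# kernel, then restore the original dtype.
def fake_quant_per_channel(w, scale, zero_point, axis=0):
    orig_dtype = w.dtype
    w32 = w.float()  # the CPU kernel rejects bfloat16 input
    q = torch.fake_quantize_per_channel_affine(
        w32, scale, zero_point, axis, -128, 127
    )
    return q.to(orig_dtype)

w = torch.randn(4, 8, dtype=torch.bfloat16)
scale = torch.full((4,), 0.02)
zero_point = torch.zeros(4, dtype=torch.int32)
out = fake_quant_per_channel(w, scale, zero_point)
```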

Contributor Author

BTW, could you share your torch and transformers versions? I encounter this error by running the example. 😓

Transformers ~ 4.57.6, torch ~ 2.10.0

This commit adds Qwen3VLVisionMLP wrapper and tests for it.

TICO-DCO-1.0-Signed-off-by: s.malakhov <s.malakhov@partner.samsung.com>
Apply suggestions from code review

Co-authored-by: Dayoung Lee <dayoung.lee@samsung.com>
TICO-DCO-1.0-Signed-off-by: s.malakhov <s.malakhov@partner.samsung.com>
@stamalakhov stamalakhov requested a review from dayo09 February 12, 2026 08:09
Contributor

@dayo09 dayo09 left a comment

LGTM

Contributor

@mhs4670go mhs4670go left a comment

LGTM

@mhs4670go mhs4670go merged commit 3dc76ec into Samsung:main Feb 13, 2026
7 checks passed
@stamalakhov stamalakhov deleted the vision_mlp_pr branch February 13, 2026 04:53